# ENeSy

## Requirements

```
torch==1.10.0
numpy
scikit-learn
tensorboardX
tqdm
```

## Train the model

The standard datasets used in our work come from the public repository of BetaE [1] and can be downloaded [here](http://snap.stanford.edu/betae/KG_data.zip).
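If you prefer to fetch the archive from a script, the following is a minimal sketch. The helper names `extract_datasets` and `download_and_extract` are hypothetical and not part of this repository, and the `data` target directory is an assumption based on the `--data_path=data/DATASET` argument used in the commands below. The folder layout inside the archive is not described here, so check the returned names against the path you pass to `--data_path`.

```python
import zipfile
from pathlib import Path
from urllib.request import urlretrieve

# URL taken from the download link above.
KG_DATA_URL = "http://snap.stanford.edu/betae/KG_data.zip"


def extract_datasets(zip_path, dest_dir="data"):
    """Unpack zip_path into dest_dir and return the sorted top-level names it contained.

    dest_dir="data" is an assumption matching --data_path=data/DATASET.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
        names = archive.namelist()
    # Keep only the first path component of every archive member.
    return sorted({name.split("/")[0] for name in names if name})


def download_and_extract(url=KG_DATA_URL, dest_dir="data"):
    """Download the archive next to the extracted data, then unpack it."""
    zip_path = Path(dest_dir) / "KG_data.zip"
    zip_path.parent.mkdir(parents=True, exist_ok=True)
    urlretrieve(url, str(zip_path))
    return extract_datasets(zip_path, dest_dir)


if __name__ == "__main__":
    # Prints the top-level folders found in the archive.
    print(download_and_extract())
```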

### 1st step

Train the entity and relation embeddings with link prediction.

```shell
python main.py --cuda --do_train --do_valid --data_path=data/DATASET -lr=0.00001 --geo=ns --tasks="1p" -kge=RotatE -pre_1p
```

### 2nd step

Train the MLP that converts a symbolic vector into an embedding.

```shell
python main.py --cuda --do_train --do_valid --data_path=data/DATASET -lr=0.00001 --geo=ns --tasks="1p" -kge=RotatE -newloss --checkpoint_path=CHECKPOINTPATH --warm_up_steps=STEP
```

### 3rd step (optional)

Fine-tune the model with complex query data.

```shell
python main.py --cuda --do_train --do_valid --data_path=data/DATASET -lr=0.00001 --geo=ns --tasks="1p.2p.3p.2i.3i.ip.pi.2in.3in.inp.pin.pni.2u.up" -kge=RotatE --checkpoint_path=CHECKPOINTPATH --warm_up_steps=STEP
```

`--geo`: string, selects the reasoning model: `vec` for GQE, `box` for Query2box, `beta` for BetaE, `ns` for neural-symbolic.

`--tasks`: string, the query types to use, joined by dots (for example `1p.2p.3p`).

`-kge`: string, selects the method used for neural projection (`RotatE` in the commands above).
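To illustrate the dot-separated `--tasks` format, here is a small sketch that splits a task string and checks it before a run is launched. How `main.py` parses the flag internally is not shown in this README; `parse_tasks` and `KNOWN_TASKS` are hypothetical names, and the list of known tasks is simply the set of names that appear in the commands in this file.

```python
# Task names as they appear in the commands of this README (an assumption
# about the full set that main.py accepts).
KNOWN_TASKS = (
    "1p", "2p", "3p", "2i", "3i", "ip", "pi",
    "2in", "3in", "inp", "pin", "pni", "2u", "up",
)


def parse_tasks(tasks):
    """Split a dot-separated task string and reject names not listed above."""
    names = [name for name in tasks.split(".") if name]
    unknown = [name for name in names if name not in KNOWN_TASKS]
    if unknown:
        raise ValueError("unknown task name(s): " + ", ".join(unknown))
    return names


if __name__ == "__main__":
    print(parse_tasks("1p.2p.3p.2i.3i"))
```

Validating the string up front catches a typo such as `2ni` before a long training job starts.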

## Test the model

```bash
python main.py --cuda --do_test --data_path=data/DATASET -lr=0.00001 --geo=ns --tasks="1p.2p.3p.2i.3i.ip.pi.2in.3in.inp.pin.pni.2u.up" -kge=RotatE --checkpoint_path=CHECKPOINTPATH -lambdas="0.5;0.5;0.5;0.5;0.5;0.5;0.5;0.5;0.5;0.5;0.5;0.5;0.5;0.5"
```

`-lambdas`: string, the lambda values used for ensemble prediction, one per task, joined by semicolons.
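Writing fourteen semicolon-separated values by hand is error-prone, so the sketch below builds the `-lambdas` string from a per-task mapping. It assumes that the i-th lambda applies to the i-th task in `--tasks`; this README does not state that explicitly. `format_lambdas` is a hypothetical helper, not part of the repository.

```python
def format_lambdas(tasks, lambda_by_task):
    """Return lambdas joined by ';' in the same order as the dot-separated tasks.

    Assumes positional alignment between --tasks and -lambdas.
    """
    names = tasks.split(".")
    missing = [name for name in names if name not in lambda_by_task]
    if missing:
        raise KeyError("no lambda given for: " + ", ".join(missing))
    # The 'g' format drops trailing zeros, so 0.50 becomes "0.5" and 0 stays "0".
    return ";".join(format(lambda_by_task[name], "g") for name in names)


if __name__ == "__main__":
    tasks = "1p.2p.3p.2i.3i.ip.pi.2in.3in.inp.pin.pni.2u.up"
    uniform = {name: 0.5 for name in tasks.split(".")}
    print(format_lambdas(tasks, uniform))
```

The printed string can be passed as `-lambdas="..."` alongside the same `--tasks` value.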

## Demonstration for FB15k-237

```shell
# 1st step
python main.py --cuda --do_train --do_valid --data_path=data/FB15k-237-betae -lr=0.0001 -d=1024 --geo=ns --tasks="1p" -kge=RotatE -pre_1p

# 2nd step
python main.py --cuda --do_train --do_valid --data_path=data/FB15k-237-betae -lr=0.00001 -d=1024 -b=32 --geo=ns --tasks="1p" -kge=RotatE -newloss --checkpoint_path=logs/FB15k-237-betae/1p/ns/g-24.0-mode-RotatE/DATE --warm_up_steps=650000

# 3rd step
python main.py --cuda --do_train --do_valid --data_path=data/FB15k-237-betae -lr=0.0000002 -d=1024 -b=26 --geo=ns --tasks="1p.2p.3p.2i.3i.ip.pi.2in.3in.inp.pin.pni.2u.up" -kge=RotatE --checkpoint_path=logs/FB15k-237-betae/1p/ns/g-24.0-mode-RotatE/DATE --warm_up_steps=1100000

# test
python main.py --cuda --do_test --data_path=data/FB15k-237-betae -lr=0.00001 --geo=ns -d=1024 --tasks="1p.2p.3p.2i.3i.ip.pi.2in.3in.inp.pin.pni.2u.up" -kge=RotatE --checkpoint_path=logs/FB15k-237-betae/1p/ns/g-24.0-mode-RotatE/DATE -lambdas="0.4;0.6;0.4;0.15;0.35;0.1;0.25;0.1;0.1;0.05;0;0.05;0.6;0.8"
```

[1] Beta Embeddings for Multi-Hop Logical Reasoning in Knowledge Graphs, Hongyu Ren and Jure Leskovec, NeurIPS 2020.
